Frequent Sets, Sequences, and Taxonomies: New, Efficient Algorithmic Proposals
Abstract
We describe efficient algorithmic proposals for three fundamental problems in data mining: association rules, episodes in sequences, and generalized association rules over hierarchical taxonomies. The association rule discovery problem aims at identifying frequent itemsets in a database and then forming conditional implication rules among them. For this task, we introduce a new algorithmic proposal that substantially reduces the number of processed transactions. The resulting algorithm, called Ready-and-Go, discovers frequent sets efficiently. Then, for the discovery of patterns in sequences of events in ordered collections of data, we propose applying the appropriate variant of that algorithm, and we additionally introduce a new framework for formalizing the concept of interesting episodes. Finally, we adapt our algorithm to the generalization of the frequent sets problem in which data is organized in taxonomic hierarchies; here we also contribute a new heuristic that, under certain natural conditions, improves performance.
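To make the association rule task concrete, the following is a minimal sketch of the two steps the abstract names: finding frequent itemsets and then forming implication rules among them. It is a generic levelwise (Apriori-style) baseline, not the Ready-and-Go algorithm, whose details the abstract does not give. The function names `frequent_itemsets` and `association_rules`, and the parameters `min_support` (an absolute count) and `min_confidence`, are assumptions chosen for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset whose count
    is at least min_support (generic levelwise search; illustrative only)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count the individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Candidates of size k: unions of two frequent (k-1)-sets whose
        # (k-1)-subsets are all frequent (downward-closure pruning).
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # One pass over the transactions per level to count the candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def association_rules(frequent, min_confidence):
    """Form rules antecedent -> consequent from the frequent itemsets.
    Confidence is support(itemset) / support(antecedent)."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent,
                # so its support is already in the dictionary.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules
```

This baseline scans every transaction once per level, which is the kind of cost that a method reducing the number of processed transactions would aim to cut.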
Similar resources
High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences
High fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns whose utility meets a user-specified minimum high utility threshold, and it involves extracting patterns that are highly accessed in mobile web service sequences. Unlike the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...
On Lacunary Statistical Limit and Cluster Points of Sequences of Fuzzy Numbers
For any lacunary sequence $\theta = (k_{r})$, we define the concepts of $S_{\theta}$-limit point and $S_{\theta}$-cluster point of a sequence of fuzzy numbers $X = (X_{k})$. We introduce the new sets $\Lambda^{F}_{S_{\theta}}(X)$ and $\Gamma^{F}_{S_{\theta}}(X)$, and prove some inclusion relations between these and the sets $\Lambda^{F}_{S}(X)$ and $\Gamma^{F}_{S}(X)$ introduced in~\cite{Ayt:Slpsfn} by Aytar [...
Relationship-aware sequential pattern mining: results on medical practise on antibiotic treatment and resistance development
Relationship-aware sequential pattern mining is the problem of mining frequent patterns in sequences in which the events of a sequence are mutually related by one or more concepts from their respective hierarchical taxonomies, based on the type of the events. Additionally, the events themselves are described with a certain number of taxonomical concepts. We present RaSP, an algorithm that is able...
Relationship-aware sequential pattern mining
Relationship-aware sequential pattern mining is the problem of mining frequent patterns in sequences in which the events of a sequence are mutually related by one or more concepts from their respective hierarchical taxonomies, based on the type of the events. Additionally, the events themselves are described with a certain number of taxonomical concepts. We present RaSP, an algorithm that is able...
A computational method to analyze the similarity of biological sequences under uncertainty
In this paper, we propose a new method to analyze the difference and similarity of biological sequences, based on fuzzy set theory. Considering the sequence order and some chemical and structural properties, we present a computational method to cluster the biological sequences. Through some examples, we show that the new method is relatively easy and we are able to compare the sequences of arbi...
Publication date: 2000